Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 3187 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 298.9 KiB |
| Average record size in memory | 96.0 B |
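The memory figures in this table are consistent with one another, which a little arithmetic can confirm. The sketch below recomputes the average record size from the total size and the row count. The 8-bytes-per-value figure is an assumption (typical of float64, int64 and object-pointer columns in pandas), not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
total_kib = 298.9   # total size in memory, KiB
n_rows = 3187       # number of observations
n_cols = 12         # number of variables

# Average record size = total bytes / number of rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(round(avg_record_bytes, 1))  # 96.0

# Assumption (not stated in the report): every column stores one
# 8-byte value per row.
BYTES_PER_VALUE = 8
row_bytes = n_cols * BYTES_PER_VALUE            # bytes per record
column_kib = n_rows * BYTES_PER_VALUE / 1024    # KiB per column
print(row_bytes, round(column_kib, 1))          # 96 24.9
```

Under that assumption each column occupies about 24.9 KiB, which matches the "Memory size" reported for every variable in this report.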
Variable types
| NUM | 9 |
|---|---|
| CAT | 3 |
Reproduction
| Analysis started | 2020-07-12 14:14:43.037704 |
|---|---|
| Analysis finished | 2020-07-12 14:14:57.626196 |
| Duration | 14.59 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
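The reported duration is simply the difference between the two timestamps. A minimal sketch, using only the values shown in the table:

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-07-12 14:14:43.037704", FMT)
finished = datetime.strptime("2020-07-12 14:14:57.626196", FMT)

# Elapsed wall-clock time in seconds, rounded the way the report shows it.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 14.59 seconds
```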
Warnings
| Warning | Type |
|---|---|
| loc.details has a high cardinality: 290 distinct values | High cardinality |
| location has a high cardinality: 1378 distinct values | High cardinality |
| deposit_amount_2012 is highly correlated with deposit_amount_2011 and 3 other fields | High correlation |
| deposit_amount_2011 is highly correlated with deposit_amount_2012 and 2 other fields | High correlation |
| deposit_amount_2013 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2014 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2015 is highly correlated with deposit_amount_2012 and 4 other fields | High correlation |
| deposit_amount_2016 is highly correlated with deposit_amount_2013 and 3 other fields | High correlation |
| deposit_amount_2017 is highly correlated with deposit_amount_2013 and 3 other fields | High correlation |
| id has unique values | Unique |
id
Real number (ℝ≥0)
| Distinct count | 3187 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1934.8249137119549 |
|---|---|
| Minimum | 1 |
| Maximum | 3772 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 188.3 |
| Q1 | 935.5 |
| median | 2059 |
| Q3 | 2879.5 |
| 95-th percentile | 3558.7 |
| Maximum | 3772 |
| Range | 3771 |
| Interquartile range (IQR) | 1944 |
Descriptive statistics
| Standard deviation | 1095.640799 |
|---|---|
| Coefficient of variation (CV) | 0.5662738736 |
| Kurtosis | -1.244757168 |
| Mean | 1934.824914 |
| Median Absolute Deviation (MAD) | 967 |
| Skewness | -0.1229164455 |
| Sum | 6166287 |
| Variance | 1200428.76 |
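Several of the statistics above are derived from others, so they can be cross-checked. The sketch below does this for the figures reported for this variable; the inputs are copied from the tables, and small differences come only from the rounding of the printed values.

```python
import math

# Inputs copied from the tables above.
n = 3187                      # number of observations
total = 6166287               # Sum
std = 1095.640799             # Standard deviation
q1, q3 = 935.5, 2879.5        # Q1 and Q3
minimum, maximum = 1, 3772    # Minimum and Maximum

mean = total / n              # Mean = Sum / n
cv = std / mean               # Coefficient of variation = std / mean
variance = std ** 2           # Variance = std squared
iqr = q3 - q1                 # Interquartile range = Q3 - Q1
value_range = maximum - minimum

print(round(mean, 6), round(cv, 6), round(variance, 2), iqr, value_range)
```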
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2047 | 1 | < 0.1% | |
| 597 | 1 | < 0.1% | |
| 593 | 1 | < 0.1% | |
| 2640 | 1 | < 0.1% | |
| 589 | 1 | < 0.1% | |
| 2636 | 1 | < 0.1% | |
| 585 | 1 | < 0.1% | |
| 2632 | 1 | < 0.1% | |
| 581 | 1 | < 0.1% | |
| 2628 | 1 | < 0.1% | |
| Other values (3177) | 3177 | 99.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3772 | 1 | < 0.1% | |
| 3768 | 1 | < 0.1% | |
| 3767 | 1 | < 0.1% | |
| 3766 | 1 | < 0.1% | |
| 3765 | 1 | < 0.1% |
deposit_amount_2011
Real number (ℝ≥0)
| Distinct count | 2950 |
|---|---|
| Unique (%) | 92.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45305.863131471604 |
|---|---|
| Minimum | 156.0 |
| Maximum | 159399.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 156 |
|---|---|
| 5-th percentile | 3671.7 |
| Q1 | 14126.85 |
| median | 36636 |
| Q3 | 67104.45 |
| 95-th percentile | 119540.85 |
| Maximum | 159399 |
| Range | 159243 |
| Interquartile range (IQR) | 52977.6 |
Descriptive statistics
| Standard deviation | 36504.93071 |
|---|---|
| Coefficient of variation (CV) | 0.8057440735 |
| Kurtosis | -0.07808347874 |
| Mean | 45305.86313 |
| Median Absolute Deviation (MAD) | 25084.5 |
| Skewness | 0.8627623671 |
| Sum | 144389785.8 |
| Variance | 1332609966 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2973.6 | 14 | 0.4% | |
| 3786 | 10 | 0.3% | |
| 4476.9 | 9 | 0.3% | |
| 3930.6 | 8 | 0.3% | |
| 3595.8 | 8 | 0.3% | |
| 2461.8 | 7 | 0.2% | |
| 3841.5 | 5 | 0.2% | |
| 4579.8 | 5 | 0.2% | |
| 3257.7 | 5 | 0.2% | |
| 3111.9 | 4 | 0.1% | |
| Other values (2940) | 3112 | 97.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 156 | 1 | < 0.1% | |
| 172.5 | 1 | < 0.1% | |
| 274.5 | 1 | < 0.1% | |
| 562.5 | 1 | < 0.1% | |
| 574.5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 159399 | 1 | < 0.1% | |
| 158926.5 | 1 | < 0.1% | |
| 157300.5 | 1 | < 0.1% | |
| 156943.5 | 1 | < 0.1% | |
| 156646.5 | 1 | < 0.1% |
deposit_amount_2012
Real number (ℝ≥0)
| Distinct count | 2988 |
|---|---|
| Unique (%) | 93.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48864.11700658927 |
|---|---|
| Minimum | 117.0 |
| Maximum | 156289.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 117 |
|---|---|
| 5-th percentile | 3455.79 |
| Q1 | 17355.75 |
| median | 41398.5 |
| Q3 | 71778 |
| 95-th percentile | 123168.3 |
| Maximum | 156289.5 |
| Range | 156172.5 |
| Interquartile range (IQR) | 54422.25 |
Descriptive statistics
| Standard deviation | 37318.30671 |
|---|---|
| Coefficient of variation (CV) | 0.763715974 |
| Kurtosis | -0.2678550215 |
| Mean | 48864.11701 |
| Median Absolute Deviation (MAD) | 26448 |
| Skewness | 0.7433380045 |
| Sum | 155729940.9 |
| Variance | 1392656016 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3420.3 | 20 | 0.6% | |
| 2315.1 | 10 | 0.3% | |
| 10397.1 | 8 | 0.3% | |
| 5062.8 | 8 | 0.3% | |
| 3845.1 | 6 | 0.2% | |
| 4722.3 | 5 | 0.2% | |
| 3442.5 | 5 | 0.2% | |
| 2613 | 5 | 0.2% | |
| 6099 | 5 | 0.2% | |
| 5127 | 4 | 0.1% | |
| Other values (2978) | 3111 | 97.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 117 | 1 | < 0.1% | |
| 180 | 1 | < 0.1% | |
| 213 | 1 | < 0.1% | |
| 214.5 | 1 | < 0.1% | |
| 298.5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 156289.5 | 1 | < 0.1% | |
| 154740 | 1 | < 0.1% | |
| 154695 | 1 | < 0.1% | |
| 154348.5 | 1 | < 0.1% | |
| 153324 | 1 | < 0.1% |
deposit_amount_2013
Real number (ℝ≥0)
| Distinct count | 3064 |
|---|---|
| Unique (%) | 96.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 56306.26501411986 |
|---|---|
| Minimum | 82.5 |
| Maximum | 192361.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 82.5 |
|---|---|
| 5-th percentile | 5382.72 |
| Q1 | 24257.25 |
| median | 48889.5 |
| Q3 | 81142.5 |
| 95-th percentile | 133243.05 |
| Maximum | 192361.5 |
| Range | 192279 |
| Interquartile range (IQR) | 56885.25 |
Descriptive statistics
| Standard deviation | 39592.22015 |
|---|---|
| Coefficient of variation (CV) | 0.7031583455 |
| Kurtosis | -0.3370600761 |
| Mean | 56306.26501 |
| Median Absolute Deviation (MAD) | 27676.5 |
| Skewness | 0.6836752758 |
| Sum | 179448066.6 |
| Variance | 1567543896 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3962.1 | 10 | 0.3% | |
| 1543.2 | 8 | 0.3% | |
| 3132.3 | 7 | 0.2% | |
| 8201.4 | 6 | 0.2% | |
| 6541.2 | 6 | 0.2% | |
| 3147 | 5 | 0.2% | |
| 3786 | 4 | 0.1% | |
| 5737.2 | 4 | 0.1% | |
| 2328.9 | 4 | 0.1% | |
| 7769.4 | 3 | 0.1% | |
| Other values (3054) | 3130 | 98.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 82.5 | 1 | < 0.1% | |
| 142.5 | 1 | < 0.1% | |
| 156 | 1 | < 0.1% | |
| 277.5 | 1 | < 0.1% | |
| 424.5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 192361.5 | 1 | < 0.1% | |
| 178746 | 1 | < 0.1% | |
| 165862.5 | 1 | < 0.1% | |
| 164914.5 | 1 | < 0.1% | |
| 163101 | 1 | < 0.1% |
deposit_amount_2014
Real number (ℝ≥0)
| Distinct count | 3097 |
|---|---|
| Unique (%) | 97.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 63760.04581110762 |
|---|---|
| Minimum | 108.0 |
| Maximum | 192744.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 108 |
|---|---|
| 5-th percentile | 8998.74 |
| Q1 | 31295.25 |
| median | 55180.5 |
| Q3 | 90468.75 |
| 95-th percentile | 143629.05 |
| Maximum | 192744 |
| Range | 192636 |
| Interquartile range (IQR) | 59173.5 |
Descriptive statistics
| Standard deviation | 41509.51272 |
|---|---|
| Coefficient of variation (CV) | 0.6510270215 |
| Kurtosis | -0.3842805608 |
| Mean | 63760.04581 |
| Median Absolute Deviation (MAD) | 28107 |
| Skewness | 0.6530100331 |
| Sum | 203203266 |
| Variance | 1723039646 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2909.7 | 6 | 0.2% | |
| 2368.8 | 5 | 0.2% | |
| 8108.7 | 4 | 0.1% | |
| 7188 | 3 | 0.1% | |
| 1695 | 3 | 0.1% | |
| 51912 | 3 | 0.1% | |
| 2277.3 | 3 | 0.1% | |
| 52413 | 3 | 0.1% | |
| 77794.5 | 2 | 0.1% | |
| 45939 | 2 | 0.1% | |
| Other values (3087) | 3153 | 98.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 108 | 1 | < 0.1% | |
| 274.5 | 1 | < 0.1% | |
| 364.5 | 1 | < 0.1% | |
| 487.5 | 1 | < 0.1% | |
| 522 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 192744 | 1 | < 0.1% | |
| 183061.5 | 1 | < 0.1% | |
| 179148 | 1 | < 0.1% | |
| 177333 | 1 | < 0.1% | |
| 177036 | 1 | < 0.1% |
deposit_amount_2015
Real number (ℝ≥0)
| Distinct count | 3142 |
|---|---|
| Unique (%) | 98.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 72616.43109507374 |
|---|---|
| Minimum | 1218.0 |
| Maximum | 231750.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 1218 |
|---|---|
| 5-th percentile | 14710.65 |
| Q1 | 37819.5 |
| median | 63582 |
| Q3 | 101347.5 |
| 95-th percentile | 156677.1 |
| Maximum | 231750 |
| Range | 230532 |
| Interquartile range (IQR) | 63528 |
Descriptive statistics
| Standard deviation | 43995.33058 |
|---|---|
| Coefficient of variation (CV) | 0.6058591688 |
| Kurtosis | -0.3119043579 |
| Mean | 72616.4311 |
| Median Absolute Deviation (MAD) | 29440.5 |
| Skewness | 0.6683973673 |
| Sum | 231428565.9 |
| Variance | 1935589113 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5850.6 | 5 | 0.2% | |
| 8505 | 2 | 0.1% | |
| 45903 | 2 | 0.1% | |
| 61992 | 2 | 0.1% | |
| 25114.5 | 2 | 0.1% | |
| 102274.5 | 2 | 0.1% | |
| 111298.5 | 2 | 0.1% | |
| 21894 | 2 | 0.1% | |
| 142119 | 2 | 0.1% | |
| 72429 | 2 | 0.1% | |
| Other values (3132) | 3164 | 99.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1218 | 1 | < 0.1% | |
| 1369.5 | 1 | < 0.1% | |
| 1504.5 | 1 | < 0.1% | |
| 1575 | 1 | < 0.1% | |
| 1704 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 231750 | 1 | < 0.1% | |
| 222052.5 | 1 | < 0.1% | |
| 201510 | 1 | < 0.1% | |
| 201225 | 1 | < 0.1% | |
| 199099.5 | 1 | < 0.1% |
deposit_amount_2016
Real number (ℝ≥0)
| Distinct count | 3140 |
|---|---|
| Unique (%) | 98.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82683.86037025416 |
|---|---|
| Minimum | 6502.5 |
| Maximum | 268407.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 6502.5 |
|---|---|
| 5-th percentile | 20894.4 |
| Q1 | 45667.5 |
| median | 72400.5 |
| Q3 | 112866 |
| 95-th percentile | 175002.6 |
| Maximum | 268407 |
| Range | 261904.5 |
| Interquartile range (IQR) | 67198.5 |
Descriptive statistics
| Standard deviation | 47620.2991 |
|---|---|
| Coefficient of variation (CV) | 0.5759322181 |
| Kurtosis | -0.1483576894 |
| Mean | 82683.86037 |
| Median Absolute Deviation (MAD) | 31339.5 |
| Skewness | 0.7310038941 |
| Sum | 263513463 |
| Variance | 2267692887 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 46591.5 | 2 | 0.1% | |
| 75835.5 | 2 | 0.1% | |
| 77362.5 | 2 | 0.1% | |
| 139911 | 2 | 0.1% | |
| 80713.5 | 2 | 0.1% | |
| 33325.5 | 2 | 0.1% | |
| 86626.5 | 2 | 0.1% | |
| 72898.5 | 2 | 0.1% | |
| 110755.5 | 2 | 0.1% | |
| 66612 | 2 | 0.1% | |
| Other values (3130) | 3167 | 99.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 6502.5 | 1 | < 0.1% | |
| 6990 | 1 | < 0.1% | |
| 7714.5 | 1 | < 0.1% | |
| 8109 | 1 | < 0.1% | |
| 8133 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 268407 | 1 | < 0.1% | |
| 249486 | 1 | < 0.1% | |
| 245535 | 1 | < 0.1% | |
| 239257.5 | 1 | < 0.1% | |
| 234763.5 | 1 | < 0.1% |
loc.details
Categorical
| Distinct count | 290 |
|---|---|
| Unique (%) | 9.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Los Angeles | 185 | 5.8% | |
| Cook | 139 | 4.4% | |
| Orange | 102 | 3.2% | |
| Harris | 93 | 2.9% | |
| Maricopa | 88 | 2.8% | |
| King | 73 | 2.3% | |
| San Diego | 71 | 2.2% | |
| Clark | 69 | 2.2% | |
| Miami-Dade | 61 | 1.9% | |
| Marion | 56 | 1.8% | |
| Other values (280) | 2250 | 70.6% |
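A frequency table of this shape (the most common values plus a single "Other values" row) can be built with a counter. The sketch below is a minimal illustration, not the profiler's own code; `common_values` and the toy input are hypothetical names and data. The last lines cross-check the report's own "Other values" row using the counts printed in the table.

```python
from collections import Counter

def common_values(values, top=10):
    """Return (label, count, percent) rows for the most frequent values,
    plus one "Other values (k)" row covering the k remaining distinct values."""
    counts = Counter(values)
    total = len(values)
    rows = [(value, count, 100 * count / total)
            for value, count in counts.most_common(top)]
    remaining_distinct = len(counts) - len(rows)
    if remaining_distinct > 0:
        remaining_count = total - sum(count for _, count, _ in rows)
        rows.append((f"Other values ({remaining_distinct})",
                     remaining_count, 100 * remaining_count / total))
    return rows

# Toy input (hypothetical, not the real column).
toy = ["Cook"] * 3 + ["King"] * 2 + ["Clark", "Orange"]
toy_rows = common_values(toy, top=2)
for row in toy_rows:
    print(row)

# Cross-check of the report's "Other values" row for this variable.
top10_counts = [185, 139, 102, 93, 88, 73, 71, 69, 61, 56]
other_count = 3187 - sum(top10_counts)
other_percent = round(100 * other_count / 3187, 1)
print(other_count, other_percent)  # 2250 70.6
```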
Length
| Max length | 16 |
|---|---|
| Median length | 7 |
| Mean length | 7.37307813 |
| Min length | 3 |
location
Categorical
| Distinct count | 1378 |
|---|---|
| Unique (%) | 43.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Chicago | 85 | 2.7% | |
| Houston | 64 | 2.0% | |
| Indianapolis | 44 | 1.4% | |
| New York City | 43 | 1.3% | |
| Los Angeles | 40 | 1.3% | |
| Las Vegas | 33 | 1.0% | |
| Miami | 33 | 1.0% | |
| San Antonio | 29 | 0.9% | |
| Seattle | 29 | 0.9% | |
| San Francisco | 26 | 0.8% | |
| Other values (1368) | 2761 | 86.6% |
Length
| Max length | 22 |
|---|---|
| Median length | 9 |
| Mean length | 9.068716661 |
| Min length | 4 |
state
Categorical
| Distinct count | 22 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| CA | 719 | 22.6% | |
| TX | 341 | 10.7% | |
| NY | 337 | 10.6% | |
| FL | 337 | 10.6% | |
| IL | 219 | 6.9% | |
| WA | 190 | 6.0% | |
| NJ | 167 | 5.2% | |
| IN | 158 | 5.0% | |
| OR | 111 | 3.5% | |
| AZ | 110 | 3.5% | |
| Other values (12) | 498 | 15.6% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
deposit_amount_2017
Real number (ℝ≥0)
| Distinct count | 3140 |
|---|---|
| Unique (%) | 98.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 124025.79055538123 |
|---|---|
| Minimum | 9753.75 |
| Maximum | 402610.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 9753.75 |
|---|---|
| 5-th percentile | 31341.6 |
| Q1 | 68501.25 |
| median | 108600.75 |
| Q3 | 169299 |
| 95-th percentile | 262503.9 |
| Maximum | 402610.5 |
| Range | 392856.75 |
| Interquartile range (IQR) | 100797.75 |
Descriptive statistics
| Standard deviation | 71430.44865 |
|---|---|
| Coefficient of variation (CV) | 0.5759322181 |
| Kurtosis | -0.1483576894 |
| Mean | 124025.7906 |
| Median Absolute Deviation (MAD) | 47009.25 |
| Skewness | 0.7310038941 |
| Sum | 395270194.5 |
| Variance | 5102308995 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 74751.75 | 2 | 0.1% | |
| 95379.75 | 2 | 0.1% | |
| 49988.25 | 2 | 0.1% | |
| 69887.25 | 2 | 0.1% | |
| 66321 | 2 | 0.1% | |
| 100030.5 | 2 | 0.1% | |
| 53579.25 | 2 | 0.1% | |
| 82383.75 | 2 | 0.1% | |
| 151852.5 | 2 | 0.1% | |
| 87833.25 | 2 | 0.1% | |
| Other values (3130) | 3167 | 99.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9753.75 | 1 | < 0.1% | |
| 10485 | 1 | < 0.1% | |
| 11571.75 | 1 | < 0.1% | |
| 12163.5 | 1 | < 0.1% | |
| 12199.5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 402610.5 | 1 | < 0.1% | |
| 374229 | 1 | < 0.1% | |
| 368302.5 | 1 | < 0.1% | |
| 358886.25 | 1 | < 0.1% | |
| 352145.25 | 1 | < 0.1% |
age_of_bank
Real number (ℝ≥0)
| Distinct count | 111 |
|---|---|
| Unique (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 71.37088170693443 |
|---|---|
| Minimum | 1 |
| Maximum | 191 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 24 |
| median | 97 |
| Q3 | 97 |
| 95-th percentile | 97 |
| Maximum | 191 |
| Range | 190 |
| Interquartile range (IQR) | 73 |
Descriptive statistics
| Standard deviation | 38.98606483 |
|---|---|
| Coefficient of variation (CV) | 0.5462460866 |
| Kurtosis | -1.028681742 |
| Mean | 71.37088171 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -0.7697950624 |
| Sum | 227459 |
| Variance | 1519.913251 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 97 | 1861 | 58.4% | |
| 5 | 93 | 2.9% | |
| 13 | 65 | 2.0% | |
| 12 | 63 | 2.0% | |
| 8 | 60 | 1.9% | |
| 6 | 56 | 1.8% | |
| 82 | 55 | 1.7% | |
| 4 | 53 | 1.7% | |
| 14 | 50 | 1.6% | |
| 9 | 49 | 1.5% | |
| Other values (101) | 782 | 24.5% |
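A Median Absolute Deviation of 0 can look like an error, but it follows from the table above: 58.4% of rows have age_of_bank equal to 97, so the median is 97 and more than half of the absolute deviations from it are zero. The sketch below shows the effect on hypothetical toy values (not the real column); `mad` is an illustrative helper, not the profiler's function.

```python
from statistics import median

def mad(values):
    """Median absolute deviation: the median of |x - median(x)|."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)

# Toy sample in which more than half of the values equal the median.
dominated = [97] * 6 + [5, 13, 82, 191]
print(median(dominated), mad(dominated))

# Contrast: an evenly spread sample has a non-zero MAD.
spread = [1, 2, 3, 4, 5]
print(median(spread), mad(spread))
```

Whenever a single value accounts for more than half of the observations, the MAD collapses to 0 regardless of how widely the remaining values are spread, so the standard deviation or IQR is more informative for such a column.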
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1 | < 0.1% | |
| 2 | 28 | 0.9% | |
| 3 | 41 | 1.3% | |
| 4 | 53 | 1.7% | |
| 5 | 93 | 2.9% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 191 | 1 | < 0.1% | |
| 167 | 1 | < 0.1% | |
| 163 | 1 | < 0.1% | |
| 162 | 2 | 0.1% | |
| 161 | 2 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
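The calculation described above can be sketched in a few lines of plain Python. The input values are the deposit_amount_2016 and deposit_amount_2017 figures from the first five rows listed under "First rows"; `pearson_r` is an illustrative helper, not the profiler's implementation. In those rows every 2017 value is exactly 1.5 times the 2016 value, so r comes out as 1 up to floating-point error, which also illustrates the invariance to scale.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard
    deviations. The 1/(n-1) factors cancel, so plain sums are used."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# First five rows of the report's "First rows" table.
deposit_2016 = [46020.0, 133905.0, 110755.5, 109837.5, 168742.5]
deposit_2017 = [69030.00, 200857.50, 166133.25, 164756.25, 253113.75]

r = pearson_r(deposit_2016, deposit_2017)
print(round(r, 6))

# Shifting and rescaling one variable leaves r unchanged;
# flipping its sign flips the sign of r.
r_shifted = pearson_r(deposit_2016, [2 * y + 100 for y in deposit_2017])
r_flipped = pearson_r(deposit_2016, [-y for y in deposit_2017])
print(round(r_shifted, 6), round(r_flipped, 6))
```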
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
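As a minimal sketch of that recipe, the code below converts each variable to ranks (ties share the average rank) and then applies the Pearson formula to the ranks. The helper names and the toy data are illustrative assumptions, not the profiler's code. The example uses a cubic relationship, which is monotonic but not linear.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]      # monotonic, but not linear

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(round(rho, 4), round(r, 4))
```

Here ρ reaches 1 because the ordering of the two variables agrees perfectly, while r stays below 1 because the points do not lie on a straight line.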
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
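The pair-counting definition above translates directly into code. The sketch below implements that simple form (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use the tie-corrected tau-b variant, so results can differ when ties are present. `kendall_tau` and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs (tau-a)."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Toy data: one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau(xs, ys)
print(round(tau, 4))  # 0.6667
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = (5 - 1) / 6.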
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | id | deposit_amount_2011 | deposit_amount_2012 | deposit_amount_2013 | deposit_amount_2014 | deposit_amount_2015 | deposit_amount_2016 | loc.details | location | state | deposit_amount_2017 | age_of_bank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 32079.0 | 35971.5 | 37237.5 | 40362.0 | 46021.5 | 46020.0 | Waukesha | Wales | WI | 69030.00 | 106 |
| 1 | 2 | 83181.0 | 84846.0 | 97098.0 | 110284.5 | 122035.5 | 133905.0 | Washington | Germantown | WI | 200857.50 | 97 |
| 2 | 4 | 68511.0 | 73932.0 | 79876.5 | 105603.0 | 112113.0 | 110755.5 | Waukesha | Pewaukee | WI | 166133.25 | 97 |
| 3 | 5 | 96271.5 | 108325.5 | 104880.0 | 121054.5 | 113956.5 | 109837.5 | Waukesha | Waukesha | WI | 164756.25 | 97 |
| 4 | 6 | 93837.0 | 101592.0 | 118270.5 | 140280.0 | 150987.0 | 168742.5 | Waukesha | Waukesha | WI | 253113.75 | 97 |
| 5 | 8 | 126933.0 | 144072.0 | 155919.0 | 164754.0 | 181075.5 | 184749.0 | Waukesha | New Berlin | WI | 277123.50 | 56 |
| 6 | 9 | 72700.5 | 73044.0 | 82053.0 | 85413.0 | 83767.5 | 87390.0 | Waukesha | Oconomowoc | WI | 131085.00 | 84 |
| 7 | 10 | 73921.5 | 73033.5 | 73011.0 | 78331.5 | 80385.0 | 83619.0 | Waukesha | Butler | WI | 125428.50 | 97 |
| 8 | 11 | 46113.0 | 47869.5 | 49678.5 | 62046.0 | 68752.5 | 82890.0 | Waukesha | Muskego | WI | 124335.00 | 97 |
| 9 | 12 | 44221.5 | 46537.5 | 52206.0 | 60166.5 | 63582.0 | 70984.5 | Waukesha | Waukesha | WI | 106476.75 | 33 |
Last rows
| | id | deposit_amount_2011 | deposit_amount_2012 | deposit_amount_2013 | deposit_amount_2014 | deposit_amount_2015 | deposit_amount_2016 | loc.details | location | state | deposit_amount_2017 | age_of_bank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3177 | 3751 | 4476.9 | 5062.8 | 5737.2 | 7188.0 | 5850.6 | 9285.0 | Cobb | Atlanta | GA | 13927.50 | 97 |
| 3178 | 3752 | 5608.8 | 7643.7 | 9831.0 | 9504.0 | 10885.2 | 14848.5 | Somerset | Bernardsville | NJ | 22272.75 | 97 |
| 3179 | 3756 | 3629.7 | 4454.1 | 6974.7 | 7107.3 | 7743.9 | 11262.0 | Sumter | The Villages | FL | 16893.00 | 1 |
| 3180 | 3763 | 4476.9 | 5062.8 | 5737.2 | 7188.0 | 5850.6 | 9238.5 | Palm Beach | Palm Beach Gardens | FL | 13857.75 | 97 |
| 3181 | 3764 | 4476.9 | 5062.8 | 5737.2 | 7188.0 | 5850.6 | 9229.5 | Duval | Jacksonville | FL | 13844.25 | 2 |
| 3182 | 3765 | 3113.1 | 5125.5 | 7568.7 | 8907.3 | 9931.5 | 12067.5 | Cook | Glencoe | IL | 18101.25 | 2 |
| 3183 | 3766 | 3860.1 | 5302.8 | 7048.5 | 8007.0 | 9378.9 | 12453.0 | Westchester | Bedford Hills | NY | 18679.50 | 2 |
| 3184 | 3767 | 4476.9 | 5062.8 | 4995.0 | 5373.6 | 5524.5 | 9711.0 | Sarasota | Sarasota | FL | 14566.50 | 97 |
| 3185 | 3768 | 4476.9 | 5062.8 | 3879.3 | 4504.5 | 4047.0 | 8109.0 | San Mateo | South San Francisco | CA | 12163.50 | 97 |
| 3186 | 3772 | 5549.1 | 7395.0 | 7934.4 | 8913.0 | 9235.5 | 15699.0 | Manatee | Bradenton | FL | 23548.50 | 97 |